Robust Text Detection from Binarized Document Images

Authors

  • Oleg Okun
  • Yu Yan
  • Matti Pietikäinen
Abstract

Many document images are rich in color and have complex backgrounds. To detect text in them, a standard approach uses both color and binary information. This often leads to time-consuming processing and requires many parameters to be tuned. In contrast, we propose a new method for text detection that uses a binary image alone. The main virtues of our method are detection of both normal and inverted text and robustness to various font types, styles, and sizes and to small skew angles, combined with a moderate number of free parameters.

Similar resources

Document Image Binarization Technique for Degraded Document Images

Document image binarization is a vital pre-processing technique for document image analysis that segments text from badly degraded document images. In this paper, we propose a robust document image binarization technique that is based on the concept of adaptive image contrast. The adaptive image contrast which is formed by combining local image contrast and the local image gradient makes it tol...

Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)

Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions. Both deteriorate the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions, aiming to improve OCR results. Our methodology is based on finding text lines by dynamic local connectivity map and then applying a l...

A Robust Document Image Binarization Technique for Degraded Document Images

Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intravariation between the document background and the foreground text of different document images. In this paper, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the loca...

Adaptive Image Contrast with Binarization Technique for Degraded Document Image

Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intra variation between the document background and the foreground text of different document images. In this paper, we propose a novel document image binarization technique that add...

Stroke Width-Based Contrast Feature for Document Image Binarization

Automatic segmentation of foreground text from the background in degraded document images is essential for smooth reading of the document content and for machine recognition tasks. In this paper, we present a novel approach to the binarization of degraded document images. The proposed method uses a new local contrast feature extracted based on the stroke width of text. First, a pre...

Publication date: 2002